UAMCLyR at RepLab 2014: Author Profiling Task
Authors
Abstract
This paper describes the participation of the Language and Reasoning Group of UAM in the RepLab 2014 Author Profiling evaluation lab. The task comprises two subtasks: author categorization and author ranking. Our method for author categorization is supervised: it draws on the information in each Twitter user's profile and applies attribute selection techniques to extract the attributes that are most representative of each user's activity domain. For the author ranking subtask we use a two-step chained method based on stylistic attributes (e.g., lexical richness, language complexity) and behavioral attributes (e.g., posting frequency, directed tweets) extracted from the users' profiles and posts. We use these attributes together with a Markov Random Field to improve an initial ranking given by the confidence of a Support Vector Machine classifier. The results obtained are encouraging and motivate us to keep working on the same ideas.
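As a rough illustration of the kind of stylistic and behavioral attributes the abstract mentions, the sketch below computes one of each from a user's posts. The abstract does not give the formulas, so everything here is an assumption: lexical richness is taken to be a type-token ratio, a "directed tweet" is taken to be a post that opens with an @mention, and the function names are hypothetical.

```python
import re


def lexical_richness(posts):
    """Type-token ratio over all posts: distinct words / total words.

    Assumption: the abstract does not define lexical richness; the
    type-token ratio is one common choice, used here for illustration.
    """
    tokens = [tok for post in posts for tok in re.findall(r"\w+", post.lower())]
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def directed_tweet_ratio(posts):
    """Share of posts that open with an @mention.

    Assumption: a "directed tweet" is read as a post addressed to
    another user, detected by a leading "@".
    """
    if not posts:
        return 0.0
    directed = sum(1 for post in posts if post.lstrip().startswith("@"))
    return directed / len(posts)


def profile_attributes(posts):
    """Bundle the two illustrative attributes into one feature dictionary."""
    return {
        "lexical_richness": lexical_richness(posts),
        "directed_tweet_ratio": directed_tweet_ratio(posts),
    }
```

In a pipeline like the one described, feature dictionaries of this sort would be turned into vectors and passed to a classifier whose confidence scores give the initial ranking. That step and the Markov Random Field refinement are not shown here.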
Similar resources
UNED at CLEf RepLab: Author Profiling
This paper describes a learning system developed for the RepLab 2014 author profiling task at UNED. The system uses a voting model, which employs a small set of features based mainly on the tweet text information such as POS tags, number of hashtags or number of links. In the unofficial run, the feature set was increased with Twitter metadata such as number of followers or retweet speed. The sy...
UAMCLyR at RepLab 2013: Profiling Task
This paper describes the participation of the Language and Reasoning Group of UAM in the RepLab 2013 Profiling evaluation lab. We adopted Distributional Term Representations (DTR) to address the following problems: i) filtering tweets that are related to an entity, and ii) identifying positive or negative implications for the entity’s reputation, i.e., polarity for reputation. Distributional Term R...
UAMCLyR at Replab2013: Monitoring Task
In this article we address the Topic Detection and Priority Detection subtasks of RepLab 2013. We try clustering and classification methods, as well as term selection techniques, to assess their performance on two subcollections of tweets: single and extended (a single tweet plus its derived tweets). Our tests show good performance even though we used very few resources.
University of Tehran at RepLab 2014
In this paper, we present our approach to the author ranking subtask, which is part of the author profiling task in RepLab 2014. In this subtask, systems are expected to detect influential authors and opinion makers on Twitter. For a given domain, a system's output must be a list of authors ranked by their probability of being an influential author or opinion maker. Our system ut...
Overview of the Author Profiling Task at PAN 2014
This overview presents the framework and the results for the Author Profiling task at PAN 2014. This year's objective is to analyze how well the detection approaches adapt to different genres. For this purpose a corpus with four different parts (subcorpora) has been compiled: social media, Twitter, blogs, and hotel reviews. The construction of the Twitter subcorpus happened i...